Bermuda: Bidirectional de novo assembly of transcripts with new insights for handling uneven coverage
Abstract
Motivation: RNA-seq has made it feasible to analyze the whole set of expressed mRNAs. Mapping-based assembly of RNA-seq reads is sometimes infeasible because high-quality references are lacking. De novo assembly, however, is very challenging because expression levels are uneven among transcripts and read coverage varies within a single transcript. Existing methods either apply de Bruijn graphs of single-sized k-mers to assemble the full set of transcripts, or conduct multiple runs of assembly but still apply graphs of single-sized k-mers in each run. A single k-mer size, however, is not suitable for all regions of transcripts with varied coverage.
Contribution: This paper presents Bermuda, a de novo assembler with new insights for handling uneven coverage. In contrast to existing methods that use a single k-mer size for all transcripts in each run of assembly, Bermuda self-adaptively uses a few k-mer sizes to assemble different regions of a single transcript according to their local coverage. As such, Bermuda can deal with uneven expression levels and coverage not only among transcripts but also within a single transcript. Extensive tests show that Bermuda outperforms popular de novo assemblers in reconstructing unevenly expressed transcripts, with longer length, better contiguity and lower redundancy. Further, Bermuda is computationally efficient with moderate memory consumption.
Correspondence should be addressed to Dr. Xu.
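The point about a single k-mer size can be illustrated with a minimal Python sketch. This is not Bermuda's algorithm; it only shows, on hypothetical toy data, why a large k loses connectivity where coverage is sparse. The function names `kmers` and `kmer_recovery`, the transcript and the reads are all assumptions made for illustration.

```python
def kmers(seq, k):
    """Return the set of all k-mers (length-k substrings) in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_recovery(transcript, reads, k):
    """Fraction of the transcript's distinct k-mers found in at least one read.

    Assumes len(transcript) >= k, so the transcript has at least one k-mer.
    """
    wanted = kmers(transcript, k)
    seen = set()
    for read in reads:
        seen |= kmers(read, k)
    return len(wanted & seen) / len(wanted)


# Hypothetical toy data: a sparsely covered region where the two reads
# overlap by only 3 bases ("TGC").
transcript = "ACGTTGCAAGGC"
reads = ["ACGTTGC", "TGCAAGGC"]

for k in (3, 5):
    print(k, kmer_recovery(transcript, reads, k))
```

With these toy inputs, k = 3 recovers every k-mer of the transcript, while k = 5 misses the one k-mer that spans the junction between the two reads (TTGCA), so a de Bruijn graph built at k = 5 would break there. A larger k is generally preferred where coverage is high because it resolves repeats better, which is the trade-off that motivates using different k-mer sizes in different regions.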
Related articles
Clustering of Short Read Sequences for de novo Transcriptome Assembly
Given the importance of transcriptome analysis in various biological studies and considering the vast amount of whole transcriptome sequencing data, it seems necessary to develop an algorithm to assemble transcriptome data. In this study we propose an algorithm for transcriptome assembly in the absence of a reference genome. First, the contiguous sequences are generated using de Bruijn graph with d...
IDBA-tran: a more robust de novo de Bruijn graph assembler for transcriptomes with uneven expression levels
MOTIVATION: RNA sequencing based on next-generation sequencing technology is effective for analyzing transcriptomes. Like de novo genome assembly, de novo transcriptome assembly does not rely on any reference genome or additional annotation information, but is more difficult. In particular, isoforms can have very uneven expression levels (e.g. 1:100), which makes it very difficult to identify low...
Optimization of de novo transcriptome assembly from next-generation sequencing data.
Transcriptome analysis has important applications in many biological fields. However, assembling a transcriptome without a known reference remains a challenging task requiring algorithmic improvements. We present two methods for substantially improving transcriptome de novo assembly. The first method relies on the observation that the use of a single k-mer length by current de novo assemblers i...
Selecting Superior De Novo Transcriptome Assemblies: Lessons Learned by Leveraging the Best Plant Genome
Whereas de novo assemblies of RNA-Seq data are being published for a growing number of species across the tree of life, there are currently no broadly accepted methods for evaluating such assemblies. Here we present a detailed comparison of 99 transcriptome assemblies, generated with 6 de novo assemblers including CLC, Trinity, SOAP, Oases, ABySS and NextGENe. Controlled analyses of de novo ass...
A consistency-based consensus algorithm for de novo and reference-guided sequence assembly of short reads
MOTIVATION: Novel high-throughput sequencing technologies pose new algorithmic challenges in handling massive amounts of short-read, high-coverage data. A robust and versatile consensus tool is of particular interest for such data since a sound multi-read alignment is a prerequisite for variation analyses, accurate genome assemblies and insert sequencing. RESULTS: A multi-read alignment algorit...
Publication date: 2015